8 research outputs found

    Do optimization methods in deep learning applications matter?

    Get PDF
    With advances in deep learning, exponential data growth, and increasing model complexity, the development of efficient optimization methods is attracting much research attention. Several implementations favor Conjugate Gradient (CG) and Stochastic Gradient Descent (SGD) as practical and elegant routes to quick convergence; however, these optimization processes also present many limitations when learning across deep learning applications. Recent research is exploring higher-order optimization functions as better approaches, but these pose very complex computational challenges for practical use. Comparing first- and higher-order optimization functions, our experiments reveal that Levenberg-Marquardt (LM) achieves significantly superior convergence but suffers from very long processing times, increasing the training cost of both classification and reinforcement learning problems. Our experiments compare off-the-shelf optimization functions (CG, SGD, LM and L-BFGS) on standard CIFAR, MNIST, CartPole and FlappyBird benchmarks. The paper presents arguments on which optimization functions to use and, further, which functions would benefit from parallelization efforts to improve pretraining time and learning-rate convergence.
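    As a concrete illustration of the kind of off-the-shelf comparison described above, the minimal PyTorch sketch below trains the same small classifier with SGD and with L-BFGS. The toy data, model, learning rates and iteration counts are illustrative assumptions rather than the paper's experimental setup, and CG and LM are omitted because torch.optim does not provide them out of the box.

    import torch
    import torch.nn as nn

    # Toy stand-in for flattened MNIST digits: 28x28 inputs, 10 classes.
    X = torch.randn(512, 784)
    y = torch.randint(0, 10, (512,))

    def make_model():
        return nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

    loss_fn = nn.CrossEntropyLoss()

    # First-order baseline: plain SGD, one gradient step per iteration.
    model_sgd = make_model()
    opt_sgd = torch.optim.SGD(model_sgd.parameters(), lr=0.1)
    for step in range(100):
        opt_sgd.zero_grad()
        loss = loss_fn(model_sgd(X), y)
        loss.backward()
        opt_sgd.step()
    print("SGD final loss:", loss.item())

    # Quasi-Newton: L-BFGS needs a closure that re-evaluates the loss.
    model_lbfgs = make_model()
    opt_lbfgs = torch.optim.LBFGS(model_lbfgs.parameters(), lr=0.5, max_iter=20)

    def closure():
        opt_lbfgs.zero_grad()
        loss = loss_fn(model_lbfgs(X), y)
        loss.backward()
        return loss

    for step in range(10):
        loss = opt_lbfgs.step(closure)
    print("L-BFGS final loss:", loss.item())

    Note that each L-BFGS step may call the closure up to max_iter times, which already illustrates the abstract's point that quasi-Newton and higher-order methods pay a much higher per-update cost than SGD.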

    A cluster based approach to reduce pattern layer size for generalized regression neural network

    No full text
    WOS: 000446742400009
    The Generalized Regression Neural Network (GRNN) is a radial-basis-function-based, supervised-learning Artificial Neural Network (ANN) commonly used for data prediction. In addition to its simple modelling structure, its speed and accurate results are among its strong features. On the other hand, a GRNN employs one neuron in the pattern layer for each sample in the training data set. Therefore, for huge data sets the pattern layer size grows in proportion to the number of training samples, and the memory requirement and computational time also increase excessively. In this study, in order to reduce the space and time complexity of the GRNN, the k-means clustering algorithm, which had previously been used as a pre-processor in the literature, is utilized, and the emergence of outlier data, which negatively affects the performance of previous studies, is prevented by identifying test data located between clusters. Hence, while the memory requirement of the pattern layer and the number of calculations are reduced, the negative effect on performance introduced by the clustering algorithm is largely removed, and almost the same prediction performance as the standard GRNN is achieved using 90% fewer training samples.
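    A minimal sketch of the cluster-based reduction idea, assuming NumPy and scikit-learn's KMeans: the training set is partitioned with k-means, and a query is answered from the nearest cluster only, unless it lies roughly between two clusters, in which case both are merged. The kernel width, cluster count and the margin rule are illustrative assumptions, not the paper's exact procedure.

    import numpy as np
    from sklearn.cluster import KMeans

    def grnn_predict(x, train_X, train_y, sigma=0.5):
        """Standard GRNN output: kernel-weighted average of training targets."""
        d2 = np.sum((train_X - x) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        return np.sum(w * train_y) / (np.sum(w) + 1e-12)

    # Toy regression data standing in for a large training set.
    rng = np.random.default_rng(0)
    train_X = rng.uniform(-3, 3, size=(5000, 2))
    train_y = np.sin(train_X[:, 0]) + 0.1 * rng.normal(size=5000)

    # Pre-processing: partition the training set with k-means.
    k = 50
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(train_X)
    labels = km.labels_

    def reduced_grnn_predict(x, sigma=0.5, margin=1.2):
        # Distances from the query to every cluster centre.
        d = np.linalg.norm(km.cluster_centers_ - x, axis=1)
        nearest = np.argsort(d)[:2]
        # If the query lies between two clusters (distances nearly equal),
        # merge both clusters so border/outlier queries are not mishandled.
        if d[nearest[1]] <= margin * d[nearest[0]]:
            mask = np.isin(labels, nearest)
        else:
            mask = labels == nearest[0]
        return grnn_predict(x, train_X[mask], train_y[mask], sigma)

    print(reduced_grnn_predict(np.array([0.5, -1.0])))

    In this sketch the pattern layer for a given query holds only the samples of one or two clusters rather than all 5000 records, which is the source of the memory and runtime savings the abstract reports.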

    Pattern Layer Reduction for a Generalized Regression Neural Network by Using a Self–Organizing Map

    No full text
    In a generalized regression neural network (GRNN), the number of neurons in the pattern layer is proportional to the number of training samples in the dataset. The use of a GRNN in applications with relatively large datasets becomes troublesome due to the required architecture and speed. The great number of neurons in the pattern layer requires a substantial increase in memory usage and causes a substantial decrease in calculation speed. Therefore, there is a strong need for pattern layer size reduction. In this study, a self-organizing map (SOM) structure is introduced as a pre-processor for the GRNN. First, an SOM is generated for the training dataset. Second, each training record is labelled with the most similar map unit. Lastly, when a new test record is applied to the network, the most similar map units are detected, and the training data that have the same labels as the detected units are fed into the network instead of the entire training dataset. This scheme enables a considerable reduction in the pattern layer size. The proposed hybrid model was evaluated using fifteen benchmark test functions and eight different UCI datasets. According to the simulation results, the proposed model significantly simplifies the GRNN's structure without any performance loss.
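    A minimal sketch of the SOM-based reduction, assuming NumPy and the MiniSom package (an assumption; the paper does not name an implementation): a SOM is trained on the inputs, every training record is labelled with its best-matching unit, and a test query is answered from the samples attached to the few most similar units. The map size, kernel width and the choice of three units are illustrative.

    import numpy as np
    from minisom import MiniSom

    # Toy regression data standing in for a large training set.
    rng = np.random.default_rng(0)
    train_X = rng.uniform(-3, 3, size=(5000, 2))
    train_y = np.sin(train_X[:, 0]) + 0.1 * rng.normal(size=5000)

    def grnn_predict(x, X, y, sigma=0.5):
        # Standard GRNN output: kernel-weighted average of training targets.
        d2 = np.sum((X - x) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        return np.sum(w * y) / (np.sum(w) + 1e-12)

    # Step 1: train a 10x10 SOM on the training inputs.
    som = MiniSom(10, 10, input_len=2, sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(train_X, 5000)

    # Step 2: label every training record with its best-matching map unit.
    labels = np.array([som.winner(x) for x in train_X])  # (row, col) pairs

    def som_grnn_predict(x, n_units=3, sigma=0.5):
        # Step 3: rank map units by similarity to the query and keep only
        # the training samples labelled with the closest units.
        weights = som.get_weights().reshape(-1, 2)           # codebook vectors
        unit_idx = np.argsort(np.linalg.norm(weights - x, axis=1))[:n_units]
        rows, cols = np.divmod(unit_idx, 10)
        mask = np.zeros(len(train_X), dtype=bool)
        for r, c in zip(rows, cols):
            mask |= (labels[:, 0] == r) & (labels[:, 1] == c)
        return grnn_predict(x, train_X[mask], train_y[mask], sigma)

    print(som_grnn_predict(np.array([0.5, -1.0])))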